# ViT backbone network
| Model | License | Task | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| `vit_large_patch16_224.orig_in21k` | Apache-2.0 | Image Classification | timm, Transformers | 584 | 2 | Vision Transformer (ViT) image classification model pre-trained on ImageNet-21k by Google Research in JAX and later ported to PyTorch; suitable for feature extraction and fine-tuning. |
| `vit_base_patch32_224.orig_in21k` | Apache-2.0 | Image Classification | timm, Transformers | 438 | 0 | ViT image classification model pre-trained on ImageNet-21k; suitable for feature extraction and fine-tuning. |
| `samvit_huge_patch16.sa1b` | Apache-2.0 | Image Segmentation | timm, Transformers | 131 | 1 | Segment Anything Vision Transformer (SAM ViT) image feature model; supports feature extraction and fine-tuning only, with no segmentation head. |
| `vit_base_patch14_dinov2.lvd142m` | Apache-2.0 | Image Classification | timm, Transformers | 50.71k | 4 | ViT image feature model pre-trained with the self-supervised DINOv2 method on the LVD-142M dataset. |
| `vit_base_patch16_224.mae` | | Image Classification | timm, Transformers | 23.63k | 2 | ViT image feature model pre-trained on ImageNet-1k with the self-supervised masked autoencoder (MAE) method. |
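All five entries are timm checkpoints, so one loading pattern covers both use cases the descriptions mention: feature extraction and fine-tuning. The sketch below is a minimal example assuming a recent timm release; the choice of checkpoints and the `num_classes=10` head are illustrative placeholders, not part of the listing, and `pretrained=True` downloads the weights on first use.

```python
# Minimal sketch (not an official recipe): using the listed timm ViT
# checkpoints as backbones. Checkpoint names come from the table above;
# num_classes=10 below is a hypothetical dataset size.
import timm
import torch

# 1) Feature extraction: num_classes=0 drops the classifier head, so the
#    forward pass returns pooled embeddings instead of logits.
backbone = timm.create_model(
    "vit_base_patch14_dinov2.lvd142m", pretrained=True, num_classes=0
)
backbone.eval()

# Rebuild the preprocessing pipeline the checkpoint was trained with.
cfg = timm.data.resolve_model_data_config(backbone)
transform = timm.data.create_transform(**cfg, is_training=False)

# Stand-in tensor with the model's expected input size (a real image would
# go through `transform` first).
dummy = torch.randn(1, 3, *cfg["input_size"][1:])
with torch.no_grad():
    pooled = backbone(dummy)                    # pooled embedding, e.g. (1, 768) for a base model
    tokens = backbone.forward_features(dummy)   # per-token features before pooling

# 2) Fine-tuning: request a fresh classification head sized for your dataset;
#    the pretrained backbone weights are kept and only the head is new.
classifier = timm.create_model(
    "vit_base_patch32_224.orig_in21k", pretrained=True, num_classes=10
)
```

Setting `num_classes=0` keeps the pretrained backbone and replaces the classifier with an identity head, which is the usual way to treat these checkpoints as feature extractors; passing a nonzero `num_classes` instead attaches a randomly initialized head for fine-tuning.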